# Descriptive statistics
Cleaned_AMA_Data %>% skim(Population)
| Name | Piped data |
| Number of rows | 9 |
| Number of columns | 76 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Population | 0 | 1 | 1613690 | 756390.3 | 284124 | 1871647 | 1936836 | 2036889 | 2138833 | ▂▁▁▁▇ |
Cleaned_AMA_Data %>% skim(IGF)
| Name | Piped data |
| Number of rows | 9 |
| Number of columns | 76 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| IGF | 0 | 1 | 31429511 | 15745039 | 14395782 | 15535894 | 32840336 | 40072210 | 55200507 | ▇▁▃▂▃ |
# Histograms
ggplot(Cleaned_AMA_Data, aes(x = Population)) +
geom_histogram(bins = 10, fill = "dodgerblue", color = "black") +
labs(title = "Distribution of Population", x = "Population") +
scale_x_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = IGF)) +
geom_histogram(bins = 10, fill = "dodgerblue", color = "black") +
labs(title = "Distribution of IGF Revenue", x = "IGF Revenue") +
scale_x_continuous(labels = comma)
# Growth Rate (Percentage)
Cleaned_AMA_Data <- Cleaned_AMA_Data %>%
mutate(
Population_Growth_Rate = c(NA, diff(Population) / Population[-length(Population)] * 100),
IGF_Growth_Rate = c(NA, diff(IGF) / IGF[-length(IGF)] * 100)
)
# Plot of Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population)) +
geom_point(aes(y = Population), color = "dodgerblue") +
labs(title = "Population Trend", x = "Year", y = "Population") +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = IGF)) +
geom_point(aes(y = IGF), color = "dodgerblue") +
labs(title = "IGF Trend", x = "Year", y = "IGF") +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population, color = "Population")) +
geom_point(aes(y = Population, color = "Population")) +
geom_line(aes(y = IGF, color = "IGF")) +
geom_point(aes(y = IGF, color = "IGF")) +
labs(title = "Population vs. IGF Revenue", x = "Year", y = "Amount/Population", color = "Type") +
scale_y_continuous(labels = comma)
# Growth rate plots
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population_Growth_Rate, color = "Population Growth")) +
geom_point(aes(y = Population_Growth_Rate, color = "Population Growth")) +
geom_line(aes(y = IGF_Growth_Rate, color = "IGF Growth")) +
geom_point(aes(y = IGF_Growth_Rate, color = "IGF Growth")) +
labs(title = "Population Growth vs. IGF Growth", x = "Year", y = "Growth Rate (%)", color = "Type") +
scale_y_continuous(labels = percent_format(scale = 1)) +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") # Add horizontal line at zero
The histograms show that population and IGF revenue are unevenly distributed. The population histogram reveals two distinct clusters. The trend plots show clearly that IGF revenue, which changed substantially over the period, does not track population, which remained stable.
mod1 <- lm(IGF ~ Population, data = Cleaned_AMA_Data)
summary(mod1)
##
## Call:
## lm(formula = IGF ~ Population, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22373920 -1860766 -820315 5514799 19515595
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13014136.12 11608703.30 1.121 0.299
## Population 11.41 6.58 1.734 0.126
##
## Residual standard error: 14080000 on 7 degrees of freedom
## Multiple R-squared: 0.3006, Adjusted R-squared: 0.2006
## F-statistic: 3.008 on 1 and 7 DF, p-value: 0.1264
Cleaned_AMA_Data %>%
ggplot(aes(x = Population, y = IGF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(x = "Population", y = "IGF Revenue (Ghana Cedis)", title = "Linear Relationship between Population and IGF Revenue") +
scale_y_continuous(labels = scales::comma)
The F-statistic and its associated p-value (0.1264) indicate no statistically significant relationship between population and IGF revenue at the 5% level. The R-squared of 0.3006 means that population accounts for only 30.06% of the variation in IGF revenue, and even this share is not distinguishable from chance.
# Scatter Plot
ggplot(Cleaned_AMA_Data, aes(x = Population, y = IGF)) +
geom_point() +
labs(title = "Population vs. IGF Revenue", x = "Population", y = "IGF Revenue")
# Residual
ggplot(data = data.frame(residuals = residuals(mod1), fitted = fitted(mod1)), aes(x = fitted, y = residuals)) +
geom_point() + # Added geom_point()
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(title = "Residuals vs. Fitted", x = "Fitted Values", y = "Residuals")
ggplot(data = data.frame(residuals = residuals(mod1)), aes(x = residuals)) +
geom_histogram(bins = 10, fill = "skyblue", color = "black") +
labs(title = "Histogram of Residuals", x = "Residuals")
ggplot(data = data.frame(residuals = residuals(mod1)), aes(sample = residuals)) +
geom_point(stat = "qq") + # Added geom_point()
stat_qq_line() +
labs(title = "Q-Q Plot of Residuals")
# Autocorrelation (Durbin-Watson Test)
dwtest(mod1)
##
## Durbin-Watson test
##
## data: mod1
## DW = 1.3269, p-value = 0.07671
## alternative hypothesis: true autocorrelation is greater than 0
# Homoscedasticity (Breusch-Pagan Test)
bptest(mod1)
##
## studentized Breusch-Pagan test
##
## data: mod1
## BP = 2.5693, df = 1, p-value = 0.109
The scatter plot shows a positive but non-linear relationship: as population increases, IGF revenue tends to increase as well. The residual plots show slight violations of the linearity and normality assumptions. The Durbin-Watson test is not significant at the 5% level (p = 0.07671), so there is no evidence of positive autocorrelation, and the Breusch-Pagan test (p = 0.109) gives no evidence against homoscedasticity.
# Transformed Models
lm(Ln_IGF ~ Ln_Pop, data = Cleaned_AMA_Data) %>% summary()
##
## Call:
## lm(formula = Ln_IGF ~ Ln_Pop, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.81443 -0.05390 0.01916 0.21476 0.51471
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.5295 2.5857 4.459 0.00294 **
## Ln_Pop 0.3987 0.1834 2.174 0.06623 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4424 on 7 degrees of freedom
## Multiple R-squared: 0.403, Adjusted R-squared: 0.3177
## F-statistic: 4.726 on 1 and 7 DF, p-value: 0.06623
Cleaned_AMA_Data$Sqrt_Population <- sqrt(Cleaned_AMA_Data$Population)
Cleaned_AMA_Data$Sqrt_IGF <- sqrt(Cleaned_AMA_Data$IGF)
lm(Sqrt_IGF ~ Sqrt_Population, data = Cleaned_AMA_Data) %>% summary()
##
## Call:
## lm(formula = Sqrt_IGF ~ Sqrt_Population, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2100.94 -152.82 -23.22 543.70 1565.49
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2780.967 1423.457 1.954 0.0917 .
## Sqrt_Population 2.188 1.121 1.952 0.0919 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1225 on 7 degrees of freedom
## Multiple R-squared: 0.3525, Adjusted R-squared: 0.26
## F-statistic: 3.811 on 1 and 7 DF, p-value: 0.09187
# Scatter Plots (Transformed Data)
ggplot(Cleaned_AMA_Data, aes(x = Ln_Pop, y = Ln_IGF)) +
geom_point() +
labs(title = "Log(Population) vs. Log(IGF Revenue)", x = "Log(Population)", y = "Log(IGF Revenue)")
ggplot(Cleaned_AMA_Data, aes(x = Sqrt_Population, y = Sqrt_IGF)) +
geom_point() +
labs(title = "Sqrt(Population) vs. Sqrt(IGF Revenue)", x = "Sqrt(Population)", y = "Sqrt(IGF Revenue)")
Even after log and square-root transformations, we found no statistically significant relationship between population and IGF revenue at the 5% level (p = 0.06623 and p = 0.09187, respectively).
Therefore, the analysis found no statistically significant relationship between population size and IGF revenue in this dataset. The small sample size (n = 9) limits the power to detect an effect, so a real pattern may have gone undetected. Factors that were not measured, and are therefore missing from the model, may also explain the variation in IGF revenue.
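To make the sample-size point concrete, the sketch below computes the smallest correlation that would reach two-sided significance with n = 9. It relies on the standard identity between the slope t-test and the correlation test in simple linear regression; the helper name `critical_r` is an illustrative assumption, not part of the original analysis.

```r
# Smallest |r| that is significant at level alpha (two-sided) with n observations.
# In simple linear regression, t = r * sqrt(df) / sqrt(1 - r^2) with df = n - 2,
# so solving for r at the critical t gives the threshold below.
critical_r <- function(n, alpha = 0.05) {
  df <- n - 2
  t_crit <- qt(1 - alpha / 2, df)
  t_crit / sqrt(t_crit^2 + df)
}

r_crit <- critical_r(9)   # threshold for this dataset's sample size
r_obs  <- sqrt(0.3006)    # from the Multiple R-squared reported for mod1

c(critical_r = r_crit, critical_r_squared = r_crit^2, observed_r = r_obs)
```

With only nine observations the threshold is roughly 0.67, so an R-squared below about 0.44 cannot be significant at the 5% level, which is consistent with the three IGF models above.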
Cleaned_AMA_Data %>% skim(Population)
| Name | Piped data |
| Number of rows | 9 |
| Number of columns | 80 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Population | 0 | 1 | 1613690 | 756390.3 | 284124 | 1871647 | 1936836 | 2036889 | 2138833 | ▂▁▁▁▇ |
Cleaned_AMA_Data %>% skim(DACF)
| Name | Piped data |
| Number of rows | 9 |
| Number of columns | 80 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| DACF | 0 | 1 | 5638029 | 2694114 | 1685390 | 3607546 | 5079623 | 8043158 | 9497586 | ▂▇▂▂▇ |
# Histograms
ggplot(Cleaned_AMA_Data, aes(x = Population)) +
geom_histogram(bins = 10, fill = "dodgerblue", color = "black") +
labs(title = "Distribution of Population", x = "Population")
ggplot(Cleaned_AMA_Data, aes(x = DACF)) +
geom_histogram(bins = 10, fill = "dodgerblue", color = "black") +
labs(title = "Distribution of DACF Revenue", x = "DACF Revenue")
# Growth Rates (Percentage)
Cleaned_AMA_Data <- Cleaned_AMA_Data %>%
mutate(
Population_Growth_Rate = c(NA, diff(Population) / Population[-length(Population)] * 100),
DACF_Growth_Rate = c(NA, diff(DACF) / DACF[-length(DACF)] * 100)
)
# Plotting Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population)) +
geom_point(aes(y = Population), color = "dodgerblue") +
labs(title = "Population Trend", x = "Year", y = "Population") +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = DACF)) +
geom_point(aes(y = DACF), color = "dodgerblue") +
labs(title = "DACF Trend", x = "Year", y = "DACF") +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population, color = "Population")) +
geom_point(aes(y = Population, color = "Population")) +
geom_line(aes(y = DACF, color = "DACF")) +
geom_point(aes(y = DACF, color = "DACF")) +
labs(title = "Population vs. DACF Revenue", x = "Year", y = "Amount/Population", color = "Type") +
scale_y_continuous(labels = scales::comma)
# Plotting Growth Rates
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population_Growth_Rate, color = "Population Growth")) +
geom_point(aes(y = Population_Growth_Rate, color = "Population Growth")) +
geom_line(aes(y = DACF_Growth_Rate, color = "DACF Growth")) +
geom_point(aes(y = DACF_Growth_Rate, color = "DACF Growth")) +
labs(title = "Population Growth vs. DACF Growth", x = "Year", y = "Growth Rate (%)", color = "Type")+
geom_hline(yintercept = 0, linetype = "dashed", color = "red")
The histograms show that population and DACF revenue are unevenly distributed. The population histogram reveals two distinct clusters. The trend plots show clearly that DACF revenue, which changed substantially over the period, does not track population, which remained stable.
mod2 <- lm(DACF ~ Population, data = Cleaned_AMA_Data)
summary(mod2)
##
## Call:
## lm(formula = DACF ~ Population, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4217574 -1710429 449791 2123625 3527669
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3980686.267 2274203.376 1.750 0.124
## Population 1.027 1.289 0.797 0.452
##
## Residual standard error: 2758000 on 7 degrees of freedom
## Multiple R-squared: 0.08315, Adjusted R-squared: -0.04783
## F-statistic: 0.6348 on 1 and 7 DF, p-value: 0.4518
Cleaned_AMA_Data %>%
ggplot(aes(x = Population, y = DACF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) + # Added confidence intervals
labs(x = "Population", y = "DACF Revenue (Ghana Cedis)", title = "Linear Relationship between Population and DACF Revenue") +
scale_y_continuous(labels = scales::comma)
The linear regression shows no statistically significant relationship (R-squared = 0.08315, p = 0.4518). Under this model, it cannot be concluded that changes in population reliably predict changes in DACF revenue; any observed pattern could plausibly be due to chance. The estimated coefficient on population is 1.027.
#Scatter Plot
ggplot(Cleaned_AMA_Data, aes(x = Population, y = DACF)) +
geom_point() +
labs(title = "Population vs. DACF Revenue",
x = "Population", y = "DACF Revenue")
# Residual
ggplot(data = data.frame(residuals = residuals(mod2),
fitted = fitted(mod2)),
aes(x = fitted, y = residuals)) +
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(title = "Residuals vs. Fitted",
x = "Fitted Values", y = "Residuals")
ggplot(data = data.frame(residuals = residuals(mod2)),
aes(x = residuals)) +
geom_histogram(bins = 10, fill = "skyblue", color = "black") +
labs(title = "Histogram of Residuals", x = "Residuals")
ggplot(data = data.frame(residuals = residuals(mod2)),
aes(sample = residuals)) +
stat_qq() +
stat_qq_line() +
labs(title = "Q-Q Plot of Residuals ")
# Autocorrelation
dwtest(mod2)
##
## Durbin-Watson test
##
## data: mod2
## DW = 2.3616, p-value = 0.609
## alternative hypothesis: true autocorrelation is greater than 0
# Homoscedasticity (Constant Variance of Residuals)
bptest(mod2)
##
## studentized Breusch-Pagan test
##
## data: mod2
## BP = 1.8931, df = 1, p-value = 0.1689
# Multicollinearity
# This is a simple linear regression with one predictor (Population), so multicollinearity is not an issue.
# Multivariate Normality
# With only one predictor (Population), multivariate normality is likewise not a separate issue; residual normality is checked in the plots above.
The scatter plot shows a positive but non-linear relationship: as population increases, DACF revenue tends to increase as well. The histogram of residuals shows a potential violation of the normality assumption. The Durbin-Watson test (p = 0.609) reveals no autocorrelation, and the Breusch-Pagan test (p = 0.1689) gives no evidence against homoscedasticity.
#Transformed Models
lm(log(DACF) ~ log(Population), data = Cleaned_AMA_Data) %>%
summary()
#
# Call:
# lm(formula = log(DACF) ~ log(Population), data = Cleaned_AMA_Data)
#
# Residuals:
# Min 1Q Median 3Q Max
# -1.1254 -0.1930 0.1834 0.4365 0.5998
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 13.8620 3.4404 4.029 0.005 **
# log(Population) 0.1109 0.2440 0.454 0.663
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 0.5887 on 7 degrees of freedom
# Multiple R-squared: 0.02863, Adjusted R-squared: -0.1101
# F-statistic: 0.2063 on 1 and 7 DF, p-value: 0.6634
lm( sqrt(DACF)~sqrt(Population), data = Cleaned_AMA_Data ) %>%
summary()
#
# Call:
# lm(formula = sqrt(DACF) ~ sqrt(Population), data = Cleaned_AMA_Data)
#
# Residuals:
# Min 1Q Median 3Q Max
# -1063.9 -300.9 170.3 471.8 711.1
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 1865.5189 717.9963 2.598 0.0355 *
# sqrt(Population) 0.3630 0.5652 0.642 0.5412
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 618.1 on 7 degrees of freedom
# Multiple R-squared: 0.05564, Adjusted R-squared: -0.07926
# F-statistic: 0.4125 on 1 and 7 DF, p-value: 0.5412
# Scatter Plots (Transformed Data)
ggplot(Cleaned_AMA_Data, aes(x = log(Population), y = log(DACF))) +
geom_point() +
labs(title = "Log(Population) vs. Log(DACF Revenue)",
x = "Log(Population)", y = "Log(DACF Revenue)")
ggplot(Cleaned_AMA_Data, aes(x = sqrt(Population), y = sqrt(DACF))) +
geom_point() +
labs(title = "Sqrt(Population) vs. Sqrt(DACF Revenue)",
x = "Sqrt(Population)", y = "Sqrt(DACF Revenue)")
The linear regression above indicated that the relationship between population size and DACF revenue is not statistically significant. The log and square-root transformations did not yield statistically significant relationships between population and DACF revenue either. Therefore, in this data, population size by itself does not appear to be a primary driver of DACF revenue.
The recurrent expenditure values are not available.
mod3 <- lm(Capital_Expenditure ~ Population, data = Cleaned_AMA_Data)
summary(mod3)
##
## Call:
## lm(formula = Capital_Expenditure ~ Population, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15480921 -4324721 -976274 9154924 11443212
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -230353.553 8235549.845 -0.028 0.978
## Population 8.073 4.668 1.729 0.127
##
## Residual standard error: 9987000 on 7 degrees of freedom
## Multiple R-squared: 0.2994, Adjusted R-squared: 0.1993
## F-statistic: 2.991 on 1 and 7 DF, p-value: 0.1274
Cleaned_AMA_Data %>%
ggplot(aes(x = Population, y = Capital_Expenditure)) +
geom_point()+
geom_smooth(method = "lm", se = TRUE) +
labs(x = "Population", y = "Capital Expenditure", title = "Linear Relationship between Population and Capital Expenditure") +
scale_y_continuous(labels = scales::comma)
From the linear regression result, the F-statistic and its associated p-value (0.1274) are not statistically significant at the 5% level. The analysis therefore found no statistically significant relationship between population and capital expenditure. Under this model, it cannot be concluded that changes in population reliably predict changes in capital expenditure; any observed pattern could plausibly be due to chance.
For every one-unit increase in population, capital expenditure is estimated to increase by 8.073. The multiple R-squared (0.2994) indicates that population explains 29.94% of the variation in capital expenditure.
#Scatter Plot
ggplot(Cleaned_AMA_Data, aes(x = Population, y = Capital_Expenditure)) +
geom_point() +
labs(title = "Population vs. Capital Expenditure",
x = "Population", y = "Capital Expenditure")
# Residual
ggplot(data = data.frame(residuals = residuals(mod3),
fitted = fitted(mod3)),
aes(x = fitted, y = residuals)) +
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(title = "Residuals vs. Fitted",
x = "Fitted Values", y = "Residuals")
ggplot(data = data.frame(residuals = residuals(mod3)),
aes(x = residuals)) +
geom_histogram(bins = 10, fill = "skyblue", color = "black") +
labs(title = "Histogram of Residuals", x = "Residuals")
ggplot(data = data.frame(residuals = residuals(mod3)),
aes(sample = residuals)) +
stat_qq() +
stat_qq_line() +
labs(title = "Q-Q Plot of Residuals ")
# Autocorrelation
dwtest(mod3)
##
## Durbin-Watson test
##
## data: mod3
## DW = 1.5868, p-value = 0.1604
## alternative hypothesis: true autocorrelation is greater than 0
# Homoscedasticity (Constant Variance of Residuals)
bptest(mod3)
##
## studentized Breusch-Pagan test
##
## data: mod3
## BP = 3.2215, df = 1, p-value = 0.07268
# Multicollinearity
# This is a simple linear regression with one predictor (Population), so multicollinearity is not an issue.
# Multivariate Normality
# With only one predictor (Population), multivariate normality is likewise not a separate issue; residual normality is checked in the plots above.
The scatter plot shows that as population increases, capital expenditure tends to increase as well, but the relationship, though positive, is non-linear. There is one cluster of points with lower population and lower capital expenditure, and another with higher population and higher capital expenditure.
The histogram of the residuals is not symmetric, which suggests the residuals are not normally distributed. Linearity is also slightly violated. The Durbin-Watson test (p = 0.1604) shows no evidence of autocorrelation, so the residuals appear uncorrelated. The Breusch-Pagan test (p = 0.07268) does not reject constant variance at the 5% level, so homoscedasticity is taken as satisfied. Since this is a simple linear regression with one predictor (population), multicollinearity is not an issue.
Cleaned_AMA_Data$Ln_Population <- log(Cleaned_AMA_Data$Population)
Cleaned_AMA_Data$Ln_Capital_Expenditure <- log(Cleaned_AMA_Data$Capital_Expenditure)
#Transformed Models
mod4 <- lm(log(Capital_Expenditure) ~ log(Population), data = Cleaned_AMA_Data)
summary(mod4)
#
# Call:
# lm(formula = log(Capital_Expenditure) ~ log(Population), data = Cleaned_AMA_Data)
#
# Residuals:
# Min 1Q Median 3Q Max
# -2.09346 -0.17567 0.05599 0.80484 0.85654
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.7563 5.8693 -0.299 0.7734
# log(Population) 1.2423 0.4163 2.984 0.0204 *
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 1.004 on 7 degrees of freedom
# Multiple R-squared: 0.5598, Adjusted R-squared: 0.4969
# F-statistic: 8.903 on 1 and 7 DF, p-value: 0.02041
# Scatter Plots (Transformed Data)
ggplot(Cleaned_AMA_Data, aes(x = log(Population), y = log(Capital_Expenditure))) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Log(Population) vs. Log(Capital Expenditure)",
x = "Log(Population)", y = "Log(Capital Expenditure)")
After the transformation, the regression is statistically significant, with p = 0.0204 for log(Population). There is a significant positive relationship between the log of population and the log of capital expenditure.
For every one-unit increase in log(Population), log(Capital_Expenditure) is expected to increase by 1.2423 units. Equivalently, a 1% increase in population corresponds to an increase of approximately 1.24% in capital expenditure. The multiple R-squared (0.5598) indicates that the log of population explains 55.98% of the variation in the log of capital expenditure.
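The percentage reading of a log-log slope is an approximation that holds for small changes. The sketch below computes the exact implied change for the slope reported above; the helper name `loglog_effect` is an illustrative assumption and not part of the original analysis.

```r
# Exact percentage change in y implied by a log-log slope b
# when x rises by pct_x percent: y scales by (1 + pct_x / 100)^b.
loglog_effect <- function(b, pct_x = 1) {
  ((1 + pct_x / 100)^b - 1) * 100
}

b <- 1.2423  # slope on log(Population) reported for mod4

effect_1pct  <- loglog_effect(b, 1)   # close to the slope itself
effect_10pct <- loglog_effect(b, 10)  # drifts away from 10 * b
c(one_percent = effect_1pct, ten_percent = effect_10pct)
```

For a 1% change the exact figure is about 1.24%, matching the slope, while for larger changes the exact value and the simple "slope times percent" rule start to diverge.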
cor.test(Cleaned_AMA_Data$Ln_Pop, Cleaned_AMA_Data$Ln_Cap_Expenditure)
##
## Pearson's product-moment correlation
##
## data: Cleaned_AMA_Data$Ln_Pop and Cleaned_AMA_Data$Ln_Cap_Expenditure
## t = 2.9838, df = 7, p-value = 0.02041
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1671597 0.9435049
## sample estimates:
## cor
## 0.7482184
The correlation is statistically significant (p = 0.02041), with a coefficient of 0.748, indicating a fairly strong positive association.
# Calculate Per Capita Values
Cleaned_AMA_Data$Capital_Exp_Per_Capita <- Cleaned_AMA_Data$Capital_Expenditure / Cleaned_AMA_Data$Population
# Plotting Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population)) +
geom_point(aes(y = Population), color = "dodgerblue") +
labs(title = "Population Trend", x = "Year", y = "Population") +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Capital_Expenditure, color = "Capital Expenditure")) +
geom_point(aes(y = Capital_Expenditure, color = "Capital Expenditure")) +
labs(title = "Capital Expenditure Trend", x = "Year", y = "Amount", color = "Type") +
theme(axis.title.y.right = element_text(vjust=2))
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Population, color = "Population")) +
geom_point(aes(y = Population, color = "Population")) +
geom_line(aes(y = Capital_Expenditure, color = "Capital Expenditure")) +
geom_point(aes(y = Capital_Expenditure, color = "Capital Expenditure")) +
labs(title = "Population and Capital Expenditure Trends", x = "Year", y = "Amount", color = "Type") +
scale_y_continuous(labels = comma, sec.axis = sec_axis(~., name = "Population")) +
theme(axis.title.y.right = element_text(vjust=2))
# Per Capita Analysis
average_capita <- mean(Cleaned_AMA_Data$Capital_Exp_Per_Capita)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Capital_Exp_Per_Capita, color = "Capital Exp. Per Capita")) +
geom_point(aes(y = Capital_Exp_Per_Capita, color = "Capital Exp. Per Capita")) +
geom_hline(yintercept = average_capita, linetype = "dashed", color = "red")+
labs(title = "Capital Expenditure Per Capita Over Time", x = "Year", y = "Ghana Cedis Per Capita", color = "Type") +
scale_y_continuous(labels = comma)
This section examines the relationship between total revenue growth rate and infrastructure delivery (capital expenditure per capita).
# Descriptive statistics
Cleaned_AMA_Data %>% skim(Capital_Exp_Per_Capita)
| Name | Piped data |
| Number of rows | 9 |
| Number of columns | 83 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Capital_Exp_Per_Capita | 0 | 1 | 7.18 | 5.05 | 0.73 | 3.08 | 5.64 | 12.8 | 13.58 | ▇▅▂▁▇ |
Cleaned_AMA_Data %>% skim(TtRev_Growth_Rate)
| Name | Piped data |
| Number of rows | 9 |
| Number of columns | 83 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| TtRev_Growth_Rate | 1 | 0.89 | -8.98 | 33.24 | -81.19 | -14.2 | -1.61 | 5.94 | 29.62 | ▂▁▂▇▃ |
# Histograms
ggplot(Cleaned_AMA_Data, aes(x = Capital_Exp_Per_Capita)) +
geom_histogram(bins = 10, fill = "dodgerblue", color = "black") +
labs(title = "Distribution of Capital expenditure per capita", x = "Capital expenditure per capita") +
scale_x_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = TtRev_Growth_Rate)) +
geom_histogram(bins = 10, fill = "dodgerblue", color = "black") +
labs(title = "Distribution of Total Revenue Growth Rate", x = "Total revenue growth rate") +
scale_x_continuous(labels = percent_format(scale = 1)) # values are already in percent
# Plotting Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = TtRev_Growth_Rate, color = "Total Revenue Growth Rate")) +
geom_point(aes(y = TtRev_Growth_Rate, color = "Total Revenue Growth Rate")) +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
geom_line(aes(y = Capital_Exp_Per_Capita, color = "Capital Expenditure Per Capita")) +
geom_point(aes(y = Capital_Exp_Per_Capita, color = "Capital Expenditure Per Capita")) +
labs(
title = "Total Revenue Growth Rate vs. Capital Expenditure Per Capita",
x = "Year",
y = "Total Revenue Growth Rate (%)"
) +
scale_y_continuous(
labels = percent_format(scale = 1),
sec.axis = sec_axis(~., name = "Capital Expenditure Per Capita")
) +
scale_color_manual(
values = c("Total Revenue Growth Rate" = "lightseagreen", "Capital Expenditure Per Capita" = "indianred"),
name = "Type"
) +
theme(axis.title.y.right = element_text(vjust = 2))
The histograms show that total revenue growth rate and capital expenditure per capita are unevenly distributed. The total revenue growth rate histogram reveals two distinct clusters. The trend plots show clearly that total revenue growth rate, which changed substantially, does not track capital expenditure per capita, which remained stable.
mod5 <- lm(Capital_Exp_Per_Capita ~ TtRev_Growth_Rate, data = Cleaned_AMA_Data)
summary(mod5)
##
## Call:
## lm(formula = Capital_Exp_Per_Capita ~ TtRev_Growth_Rate, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.2716 -3.7391 -0.3963 4.0043 6.9698
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.83859 2.01599 3.888 0.00809 **
## TtRev_Growth_Rate 0.05206 0.06229 0.836 0.43528
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.478 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.1043, Adjusted R-squared: -0.045
## F-statistic: 0.6986 on 1 and 6 DF, p-value: 0.4353
ggplot(Cleaned_AMA_Data, aes(x = TtRev_Growth_Rate, y = Capital_Exp_Per_Capita)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE)+
labs(title = "Revenue Growth vs. Capital Expenditure (Per Capita)",
x = "Total Revenue Growth Rate (%)",
y = "Capital Expenditure Per Capita")
The regression shows no statistically significant relationship between total revenue growth rate and infrastructure delivery (capital expenditure per capita): the p-value (0.43528) is greater than the 0.05 significance level. Changes in revenue growth therefore do not significantly predict changes in capital expenditure per capita in this model. The R-squared (0.1043) indicates that total revenue growth rate explains only 10.43% of the variation in capital expenditure per capita.
#Transformed Models
lm(log(Capital_Exp_Per_Capita) ~ log(TtRev_Growth_Rate), data = Cleaned_AMA_Data) %>%
summary()
#
# Call:
# lm(formula = log(Capital_Exp_Per_Capita) ~ log(TtRev_Growth_Rate),
# data = Cleaned_AMA_Data)
#
# Residuals:
# 2 4 7
# -0.20616 0.30053 -0.09437
# attr(,"label")
# [1] "Capital Expenditure"
# attr(,"format.spss")
# [1] "F8.0"
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.7036 0.5923 -2.876 0.213
# log(TtRev_Growth_Rate) 1.3160 0.2299 5.725 0.110
#
# Residual standard error: 0.3765 on 1 degrees of freedom
# (6 observations deleted due to missingness)
# Multiple R-squared: 0.9704, Adjusted R-squared: 0.9408
# F-statistic: 32.77 on 1 and 1 DF, p-value: 0.1101
cor.test(Cleaned_AMA_Data$TtRev_Growth_Rate, Cleaned_AMA_Data$Capital_Exp_Per_Capita) # incomplete pairs are dropped automatically
#
# Pearson's product-moment correlation
#
# data: Cleaned_AMA_Data$TtRev_Growth_Rate and Cleaned_AMA_Data$Capital_Exp_Per_Capita
# t = 0.8358, df = 6, p-value = 0.4353
# alternative hypothesis: true correlation is not equal to 0
# 95 percent confidence interval:
# -0.4942023 0.8371108
# sample estimates:
# cor
# 0.322932
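The log-transformed model and the correlation test both remain non-significant (p = 0.1101 and p = 0.4353). The log model is fitted on only three observations, because the logarithm is undefined for the negative growth rates and those rows are dropped; its R-squared of 0.9704 should therefore not be read as evidence of a strong relationship.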
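The loss of observations in the log model can be reproduced on a small example. The growth rates below are hypothetical values chosen for illustration, not the figures in `Cleaned_AMA_Data`; the sketch only shows how taking logs of a series with negative entries shrinks the usable sample.

```r
# Hypothetical growth rates in percent (illustrative only, not the report's data)
growth <- c(NA, -81, -14, 5, -2, 30, -10, 6, -1)

# log() returns NaN (with a warning) for negative inputs and NA for NA
logged <- suppressWarnings(log(growth))

# lm() drops rows whose model variables are NA or NaN,
# so only the finite values would enter a log-transformed regression
usable <- sum(is.finite(logged))
dropped <- length(growth) - usable
c(usable = usable, dropped = dropped)
```

Before fitting a log-transformed model on a rate that can be negative, it is worth counting the finite values in this way to see how many observations the fit will actually use.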
Cleaned_AMA_Data$Expenditure_Growth <- c(NA, diff(Cleaned_AMA_Data$Total_Expenditure) / Cleaned_AMA_Data$Total_Expenditure[-nrow(Cleaned_AMA_Data)]) * 100
mod6 <- lm(Capital_Exp_Per_Capita ~ Expenditure_Growth, data = Cleaned_AMA_Data)
summary(mod6)
##
## Call:
## lm(formula = Capital_Exp_Per_Capita ~ Expenditure_Growth, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.0062 -2.8932 -0.2143 1.9967 7.1834
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.50991 1.83291 4.097 0.00638 **
## Expenditure_Growth 0.08746 0.07122 1.228 0.26545
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.174 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.2008, Adjusted R-squared: 0.06765
## F-statistic: 1.508 on 1 and 6 DF, p-value: 0.2654
ggplot(Cleaned_AMA_Data, aes(x = Expenditure_Growth, y = Capital_Exp_Per_Capita)) +
geom_point() + geom_smooth(method = "lm", se = TRUE)+
labs(title = "Expenditure Growth vs. Capital Expenditure (Per Capita)",
x = "Expenditure Growth Rate (%)",
y = "Capital Expenditure Per Capita")
lm(log(Capital_Exp_Per_Capita) ~ Expenditure_Growth, data = Cleaned_AMA_Data) %>%
summary()
##
## Call:
## lm(formula = log(Capital_Exp_Per_Capita) ~ Expenditure_Growth,
## data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.0133 -0.2399 0.1215 0.5026 1.1014
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.66341 0.37156 4.477 0.00421 **
## Expenditure_Growth 0.01226 0.01444 0.849 0.42829
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.049 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.1073, Adjusted R-squared: -0.04146
## F-statistic: 0.7213 on 1 and 6 DF, p-value: 0.4283
The linear regression shows no statistically significant relationship between expenditure growth and capital expenditure per capita (p = 0.265), and the result remains non-significant after log-transforming the response (p = 0.428).
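The growth-rate calculation and the level versus log comparison can be sketched as follows. This is a minimal illustration on invented numbers, not the AMA data; the `toy` data frame and the helpers `pct_growth()` and `slope_p()` are hypothetical names introduced here.

```r
# Hypothetical toy data; the values are invented for illustration only.
toy <- data.frame(
  Total_Expenditure      = c(100, 110, 99, 120, 150, 140, 160, 200, 210),
  Capital_Exp_Per_Capita = c(4, 6, 3, 8, 12, 7, 10, 18, 15)
)

# Year-on-year percentage growth: (x_t - x_{t-1}) / x_{t-1} * 100.
# The first year has no predecessor, so its growth rate is NA.
pct_growth <- function(x) c(NA, diff(x) / head(x, -1) * 100)

toy$Expenditure_Growth <- pct_growth(toy$Total_Expenditure)

# Fit the level and log-level models; lm() drops the NA row by default.
fit_level <- lm(Capital_Exp_Per_Capita ~ Expenditure_Growth, data = toy)
fit_log   <- lm(log(Capital_Exp_Per_Capita) ~ Expenditure_Growth, data = toy)

# Pull the slope p-value out of each summary for a side-by-side comparison.
slope_p <- function(fit) coef(summary(fit))["Expenditure_Growth", "Pr(>|t|)"]
p_values <- c(level = slope_p(fit_level), log = slope_p(fit_log))
print(round(p_values, 4))
```

Collecting the two p-values in one named vector makes it easier to see whether the transformation changes the conclusion.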
# no variables
# Expenditure Composition:
Cleaned_AMA_Data$CapExp_Pct <- (Cleaned_AMA_Data$Capital_Expenditure / Cleaned_AMA_Data$Total_Expenditure) * 100
Cleaned_AMA_Data$CapExp_Rev_Ratio <- (Cleaned_AMA_Data$Capital_Expenditure / Cleaned_AMA_Data$Total_Revenue)
# Expenditure Composition
ggplot(Cleaned_AMA_Data, aes(x = Year, y = CapExp_Pct)) +
geom_bar(stat = "identity", fill = "dodgerblue") +
geom_point()+
labs(title = "Capital Expenditure as Percentage of Total Expenditure",
x = "Year",
y = "Percentage") +
scale_y_continuous(labels = percent_format(accuracy = 1, scale = 1)) # CapExp_Pct is already on a 0-100 scale
# Trends of Revenue and Expenditure over the years.
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Total_Revenue, color = "Total Revenue")) +
geom_point(aes(y = Total_Revenue)) +
geom_line(aes(y = Total_Expenditure, color = "Total Expenditure")) +
geom_point(aes(y = Total_Expenditure)) +
labs(title = "Revenue and Expenditure Trends Over Years",
x = "Year",
y = "Amount (Ghana Cedis)", color = "Type") +
scale_color_manual(values = c("Total Revenue" = "blue", "Total Expenditure" = "red")) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Total_Revenue, color = "Total Revenue"), size = 1) +
geom_line(aes(y = IGF, color = "IGF"), size = 1) +
geom_line(aes(y = DACF, color = "DACF"), size = 1) +
geom_line(aes(y = Capital_Expenditure, color = "Capital Expenditure"), size = 1) +
geom_line(aes(y = Total_Expenditure, color = "Total Expenditure"), size = 1) +
geom_line(aes(y = Others_Sources, color = "Other Sources"), size = 1) +
labs(
title = "Revenue and Expenditure Trends Over Years",
x = "Year",
y = "Amount (Ghana Cedis)",
color = "Type"
) +
scale_color_manual(
values = c(
"Total Revenue" = "blue",
"Other Sources" = "skyblue",
"IGF" = "green",
"DACF" = "darkgray",
"Capital Expenditure" = "purple",
"Total Expenditure" = "red"
)
) +
scale_y_continuous(labels = scales::comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# IGF to Total Expenditure Ratio
ggplot(Cleaned_AMA_Data, aes(x = Year, y = IGF_TE)) +
geom_line(color = "steelblue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "IGF to Total Expenditure Ratio Over Years",
x = "Year",
y = "Ratio (IGF/Total Expenditure)"
) +
scale_y_continuous(labels = percent_format(accuracy = 1))
# CapExp_Rev_Ratio plot.
ggplot(Cleaned_AMA_Data, aes(x = Year, y = CapExp_Rev_Ratio)) +
geom_line(color = "steelblue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "Capital Expenditure to Total Revenue Ratio Over Years",
x = "Year",
y = "Ratio (Capital Expenditure/Total Revenue)"
) +
scale_y_continuous(labels = comma)
cor.test(Cleaned_AMA_Data$Total_Expenditure, Cleaned_AMA_Data$Total_Revenue)
##
## Pearson's product-moment correlation
##
## data: Cleaned_AMA_Data$Total_Expenditure and Cleaned_AMA_Data$Total_Revenue
## t = 23.708, df = 7, p-value = 0.00000006037
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9698015 0.9987517
## sample estimates:
## cor
## 0.9938305
In the plots above, capital expenditure as a percentage of total expenditure rises to a peak around 2018 and then declines sharply and persistently. Total revenue and total expenditure are also very strongly correlated (r = 0.994, p < 0.001), and both reach their highest levels from around 2017 onward.
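As a check on the correlation test, the t statistic can be recomputed by hand from the reported correlation using t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below uses only the values printed in the `cor.test()` output and base R; the variable names are introduced here for illustration.

```r
# Values reported in the cor.test() output above.
r <- 0.9938305   # sample Pearson correlation
n <- 9           # number of years, so df = n - 2 = 7

# t statistic for the null hypothesis that the true correlation is 0.
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

cat(sprintf("t = %.3f, df = %d, p = %.3g\n", t_stat, n - 2, p_value))
```

With only nine observations, the test is significant here because the correlation itself is so close to 1.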
# Revenue Per Capita
Cleaned_AMA_Data$Total_Revenue_Per_Capita <- Cleaned_AMA_Data$Total_Revenue / Cleaned_AMA_Data$Population
Cleaned_AMA_Data$IGF_Per_Capita <- Cleaned_AMA_Data$IGF / Cleaned_AMA_Data$Population
Cleaned_AMA_Data$DACF_Per_Capita <- Cleaned_AMA_Data$DACF / Cleaned_AMA_Data$Population
# Time Series Plots (Improved)
# Total Revenue and Expenditure Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Total_Revenue, color = "Total Revenue"), size = 1) +
geom_point(aes(y = Total_Revenue, color = "Total Revenue")) +
geom_line(aes(y = IGF, color = "IGF"), size = 1) +
geom_point(aes(y = IGF, color = "IGF")) +
geom_line(aes(y = DACF, color = "DACF"), size = 1) +
geom_point(aes(y = DACF, color = "DACF")) +
geom_line(aes(y = Capital_Expenditure, color = "Capital Expenditure"), size = 1) +
geom_point(aes(y = Capital_Expenditure, color = "Capital Expenditure")) +
geom_line(aes(y = Total_Expenditure, color = "Total Expenditure"), size = 1) +
geom_point(aes(y = Total_Expenditure, color = "Total Expenditure")) +
geom_line(aes(y = Others_Sources, color = "Other Sources"), size = 1) +
geom_point(aes(y = Others_Sources, color = "Other Sources")) +
labs(
title = "Revenue and Expenditure Trends Over Years",
x = "Year",
y = "Amount (Ghana Cedis)",
color = "Type"
) +
scale_color_manual(
values = c(
"Total Revenue" = "blue",
"Other Sources" = "skyblue",
"IGF" = "green",
"DACF" = "darkgray",
"Capital Expenditure" = "purple",
"Total Expenditure" = "red"
)
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Population Trend
ggplot(Cleaned_AMA_Data, aes(x = Year, y = Population)) +
geom_line(color = "steelblue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "Population Trend Over Years",
x = "Year",
y = "Population"
) +
scale_y_continuous(labels = comma) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold")
)
# IGF to Total Expenditure Ratio
ggplot(Cleaned_AMA_Data, aes(x = Year, y = IGF_TE)) +
geom_line(color = "steelblue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "IGF to Total Expenditure Ratio Over Years",
x = "Year",
y = "Ratio (IGF/Total Expenditure)"
) +
scale_y_continuous(labels = percent_format(accuracy = 1)) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold")
)
# Per capita plot
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Total_Revenue_Per_Capita, color = "Total Revenue Per Capita")) +
geom_point(aes(y = Total_Revenue_Per_Capita, color = "Total Revenue Per Capita")) +
geom_line(aes(y = IGF_Per_Capita, color = "IGF Per Capita")) +
geom_point(aes(y = IGF_Per_Capita, color = "IGF Per Capita")) +
geom_line(aes(y = DACF_Per_Capita, color = "DACF Per Capita")) +
geom_point(aes(y = DACF_Per_Capita, color = "DACF Per Capita")) +
labs(title = "Revenue Per Capita trends", x = "Year", y = "Amount (Ghana Cedis)", color = "Type") +
scale_y_continuous(labels = comma)
cor_matrix <- cor(Cleaned_AMA_Data[, c("Population", "Total_Revenue", "Total_Expenditure", "IGF_TE", "CapExp_Pct", "IGF")], use = "complete.obs")
print(cor_matrix)
## Population Total_Revenue Total_Expenditure IGF_TE
## Population 1.0000000 0.5363112 0.5632552 0.4315557
## Total_Revenue 0.5363112 1.0000000 0.9938305 0.5808075
## Total_Expenditure 0.5632552 0.9938305 1.0000000 0.5475533
## IGF_TE 0.4315557 0.5808075 0.5475533 1.0000000
## CapExp_Pct 0.6080252 0.7932208 0.8353768 0.6303333
## IGF 0.5482297 0.9341365 0.9195661 0.8180036
## CapExp_Pct IGF
## Population 0.6080252 0.5482297
## Total_Revenue 0.7932208 0.9341365
## Total_Expenditure 0.8353768 0.9195661
## IGF_TE 0.6303333 0.8180036
## CapExp_Pct 1.0000000 0.8079598
## IGF 0.8079598 1.0000000
corrplot(cor_matrix, main = "Correlation matrix of population and expenditure patterns")
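The correlation matrix above shows a very strong positive correlation between total revenue and total expenditure (0.994). IGF is also strongly correlated with both total revenue (0.934) and total expenditure (0.920).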
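One way to read a correlation matrix systematically is to list each variable pair once and sort by absolute correlation. The sketch below rebuilds a four-variable subset of the matrix from the printed values; the object names `cm` and `pairs_tbl` are assumptions made for this example.

```r
# Subset of the printed correlation matrix (values copied from the output above).
vars <- c("Population", "Total_Revenue", "Total_Expenditure", "IGF")
cm <- diag(4)
dimnames(cm) <- list(vars, vars)

# upper.tri() fills column by column: (1,2), (1,3), (2,3), (1,4), (2,4), (3,4).
cm[upper.tri(cm)] <- c(0.5363112, 0.5632552, 0.9938305,
                       0.5482297, 0.9341365, 0.9195661)
cm[lower.tri(cm)] <- t(cm)[lower.tri(cm)]  # mirror to make the matrix symmetric

# One row per unique pair, sorted from strongest to weakest correlation.
idx <- which(upper.tri(cm), arr.ind = TRUE)
pairs_tbl <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx],
  row.names = NULL
)
pairs_tbl <- pairs_tbl[order(-abs(pairs_tbl$r)), ]
print(pairs_tbl, row.names = FALSE)
```

Sorting the pairs makes it clear that the three fiscal variables are tightly linked, while population is only moderately correlated with each of them.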
# Total Revenue vs Population
model_revenue_pop <- lm(Total_Revenue ~ Population, data = Cleaned_AMA_Data)
summary(model_revenue_pop)
##
## Call:
## lm(formula = Total_Revenue ~ Population, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29421174 -9145469 -1831067 15761302 30092080
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40246399.97 17978138.90 2.239 0.0602 .
## Population 17.13 10.19 1.681 0.1366
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21800000 on 7 degrees of freedom
## Multiple R-squared: 0.2876, Adjusted R-squared: 0.1859
## F-statistic: 2.826 on 1 and 7 DF, p-value: 0.1366
# Total Expenditure vs Population
model_expenditure_pop <- lm(Total_Expenditure ~ Population, data = Cleaned_AMA_Data)
summary(model_expenditure_pop)
##
## Call:
## lm(formula = Total_Expenditure ~ Population, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27719681 -11976200 -998826 17534296 24229751
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37555261.801 17222106.762 2.181 0.0656 .
## Population 17.606 9.762 1.804 0.1143
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20880000 on 7 degrees of freedom
## Multiple R-squared: 0.3173, Adjusted R-squared: 0.2197
## F-statistic: 3.253 on 1 and 7 DF, p-value: 0.1143
# Capital Expenditure vs Total Revenue and IGF_TE
model_capital_rev_igf <- lm(Capital_Expenditure ~ Total_Revenue + IGF_TE, data = Cleaned_AMA_Data)
summary(model_capital_rev_igf)
##
## Call:
## lm(formula = Capital_Expenditure ~ Total_Revenue + IGF_TE, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11115284 -1579096 -1347879 3150098 7196110
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17083481.3394 9933027.7839 -1.720 0.1362
## Total_Revenue 0.3807 0.1123 3.391 0.0147 *
## IGF_TE 8799506.4086 25967153.1929 0.339 0.7463
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6247000 on 6 degrees of freedom
## Multiple R-squared: 0.7651, Adjusted R-squared: 0.6867
## F-statistic: 9.769 on 2 and 6 DF, p-value: 0.01297
# IGF_TE vs Population and Total Revenue
model_igfte_pop_rev <- lm(IGF_TE ~ Population + Total_Revenue, data = Cleaned_AMA_Data)
summary(model_igfte_pop_rev)
##
## Call:
## lm(formula = IGF_TE ~ Population + Total_Revenue, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.11347 -0.06445 0.02422 0.04453 0.11092
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.276691517224 0.104452816859 2.649 0.0381 *
## Population 0.000000023280 0.000000053550 0.435 0.6790
## Total_Revenue 0.000000002121 0.000000001676 1.265 0.2528
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09669 on 6 degrees of freedom
## Multiple R-squared: 0.3576, Adjusted R-squared: 0.1434
## F-statistic: 1.67 on 2 and 6 DF, p-value: 0.2651
# Visualizations
# Scatter plot: Total Revenue vs Population
ggplot(Cleaned_AMA_Data, aes(x = Population, y = Total_Revenue)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Total Revenue vs Population", x = "Population", y = "Total Revenue") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
# Scatter plot: Total Expenditure vs Population
ggplot(Cleaned_AMA_Data, aes(x = Population, y = Total_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Total Expenditure vs Population", x = "Population", y = "Total Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
# Scatter plot: Capital Expenditure vs Total Revenue
ggplot(Cleaned_AMA_Data, aes(x = Total_Revenue, y = Capital_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Capital Expenditure vs Total Revenue", x = "Total Revenue", y = "Capital Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
# Scatter plot: IGF_TE vs Population
ggplot(Cleaned_AMA_Data, aes(x = Population, y = IGF_TE)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF_TE vs Population", x = "Population", y = "IGF_TE") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = percent_format(accuracy = 1))
ggplot(Cleaned_AMA_Data, aes(x = Total_Revenue, y = IGF_TE)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF_TE vs Total Revenue", x = "Total Revenue", y = "IGF_TE") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = percent_format(accuracy = 1))
In the regression results above, population is not a significant predictor of either total revenue (p = 0.137) or total expenditure (p = 0.114). In the model of capital expenditure on total revenue and IGF_TE, total revenue is significant (p = 0.015) while IGF_TE is not (p = 0.746). In the model of IGF_TE on population and total revenue, neither predictor is significant (p = 0.679 and p = 0.253).
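When several models are compared, it can help to gather the slope estimates and p-values into one table rather than reading each `summary()` separately. The following is a minimal sketch on invented data; the `toy` data frame, its values, and the names `formulas` and `coef_table` are hypothetical.

```r
# Hypothetical toy data; values are invented and do not reproduce the AMA figures.
toy <- data.frame(
  Population          = c(1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4),
  Total_Revenue       = c(50, 58, 49, 70, 85, 77, 95, 110, 104),
  IGF_TE              = c(0.30, 0.35, 0.28, 0.40, 0.45, 0.38, 0.50, 0.47, 0.52),
  Capital_Expenditure = c(5, 9, 4, 12, 20, 14, 22, 30, 27)
)

# Model formulas mirroring the structure of the regressions above.
formulas <- list(
  revenue_pop     = Total_Revenue ~ Population,
  capital_rev_igf = Capital_Expenditure ~ Total_Revenue + IGF_TE,
  igfte_pop_rev   = IGF_TE ~ Population + Total_Revenue
)

# Fit each model and keep one row per non-intercept term.
coef_rows <- lapply(names(formulas), function(nm) {
  tab <- coef(summary(lm(formulas[[nm]], data = toy)))
  tab <- tab[rownames(tab) != "(Intercept)", , drop = FALSE]
  data.frame(model = nm, term = rownames(tab),
             estimate = tab[, "Estimate"], p_value = tab[, "Pr(>|t|)"],
             row.names = NULL)
})
coef_table <- do.call(rbind, coef_rows)
coef_table$significant_5pct <- coef_table$p_value < 0.05
print(coef_table)
```

A table like this makes it harder to attribute a significant term to the wrong model.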
# no variables
# IGF Trend
ggplot(Cleaned_AMA_Data, aes(x = Year, y = IGF)) +
geom_line(color = "blue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "IGF Trend Over Years",
x = "Year",
y = "IGF (Ghana Cedis)"
) +
scale_y_continuous(labels = comma) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold")
)
# Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_point(aes(y = Act_Permit, color = "Permit Fees")) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_point(aes(y = Act_Property_Rates, color = "Property Rates")) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_point(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue")) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_point(aes(y = Act_Licenses, color = "Licenses")) +
geom_line(aes(y = Act_Fees, color = "Act Fees"), size = 1) +
geom_point(aes(y = Act_Fees, color = "Act Fees")) +
labs(
title = "Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis)",
color = "Revenue Type"
) +
scale_y_continuous(labels = comma) +
scale_color_brewer(palette = "Set1")+
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# IGF and Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = IGF, color = "IGF"), size = 1) +
geom_point(aes(y = IGF, color = "IGF")) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_point(aes(y = Act_Permit, color = "Permit Fees")) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_point(aes(y = Act_Property_Rates, color = "Property Rates")) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_point(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue")) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_point(aes(y = Act_Licenses, color = "Licenses")) +
geom_line(aes(y = Act_Fees, color = "Act Fees"), size = 1) +
geom_point(aes(y = Act_Fees, color = "Act Fees")) +
labs(
title = "IGF vs. Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis)",
color = "Revenue Type"
) +
scale_y_continuous(labels = comma) +
scale_color_brewer(palette = "Set1")+
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
The plots above show the trends in IGF and the land-based revenue sources over the years.
# IGF vs Land-Based Revenues
model_igf_land <- lm(IGF ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands + Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
summary(model_igf_land)
##
## Call:
## lm(formula = IGF ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands +
## Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## 1 2 3 4 5 6 7 8
## -289040 -161318 -106231 -195186 815665 -554989 2086435 -1595337
## attr(,"label")
## [1] "IGF"
## attr(,"format.spss")
## [1] "F8.0"
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1175831.1596 2380655.4331 0.494 0.6703
## Act_Permit 2.1687 0.6231 3.480 0.0736 .
## Act_Property_Rates -0.6037 0.9474 -0.637 0.5892
## Act_Stool_Lands 6.9193 4.5629 1.516 0.2687
## Act_Licenses 2.0198 0.6858 2.945 0.0985 .
## Act_Fees 1.1865 0.6947 1.708 0.2298
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2004000 on 2 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.9953, Adjusted R-squared: 0.9835
## F-statistic: 84.23 on 5 and 2 DF, p-value: 0.01177
# Visualizations
# Scatter plots (IGF vs each land-based revenue)
ggplot(Cleaned_AMA_Data, aes(x = Act_Permit, y = IGF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF vs Permit Fees", x = "Permit Fees", y = "IGF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Property_Rates, y = IGF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF vs Property Rates", x = "Property Rates", y = "IGF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Stool_Lands, y = IGF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF vs Stool Lands Revenue", x = "Stool Lands Revenue", y = "IGF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Licenses, y = IGF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF vs Licenses", x = "Licenses", y = "IGF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Fees, y = IGF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "IGF vs Act Fees", x = "Act Fees", y = "IGF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
cor_matrix_land_igf <- cor(Cleaned_AMA_Data[, c("IGF", "Act_Permit", "Act_Property_Rates", "Act_Stool_Lands", "Act_Licenses", "Act_Fees")], use = "complete.obs")
print(cor_matrix_land_igf)
## IGF Act_Permit Act_Property_Rates Act_Stool_Lands
## IGF 1.0000000 0.9184820 0.9792669 0.22060788
## Act_Permit 0.9184820 1.0000000 0.9024330 0.10633685
## Act_Property_Rates 0.9792669 0.9024330 1.0000000 0.24118433
## Act_Stool_Lands 0.2206079 0.1063368 0.2411843 1.00000000
## Act_Licenses 0.8367159 0.5755145 0.8421432 0.21880908
## Act_Fees 0.8692872 0.7497758 0.8397400 0.04329437
## Act_Licenses Act_Fees
## IGF 0.8367159 0.86928715
## Act_Permit 0.5755145 0.74977580
## Act_Property_Rates 0.8421432 0.83973997
## Act_Stool_Lands 0.2188091 0.04329437
## Act_Licenses 1.0000000 0.74735384
## Act_Fees 0.7473538 1.00000000
corrplot(cor_matrix_land_igf)
The multiple regression of IGF on all the land-based revenues (permit fees, property rates, stool lands revenue, licenses, and fees) is statistically significant overall (F = 84.23, p = 0.012), with a very high R-squared of 0.9953: the land-based revenues together explain 99.53% of the variation in IGF. However, none of the individual terms is significant at the 5% level; permit fees (p = 0.074) and licenses (p = 0.099) are significant only at the 10% level. With only 2 residual degrees of freedom, this fit should be interpreted with caution.
The correlation matrix shows that IGF is strongly correlated with all the land-based revenues except stool lands revenue (0.2206).
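A model that is significant overall while its individual terms are not is a pattern often associated with correlated predictors, and the correlation matrix does show strong correlations among several of the land-based revenues. One common diagnostic is the variance inflation factor, VIF = 1 / (1 - R^2), where R^2 comes from regressing each predictor on the others. The sketch below computes it in base R on invented data; `vif_manual()` and the `toy` values are assumptions for illustration, and no VIF was computed in the original analysis.

```r
# Hypothetical predictors with deliberately correlated values (invented numbers).
toy <- data.frame(
  Act_Permit         = c(2, 3, 3.5, 5, 6, 7.5, 8, 10),
  Act_Property_Rates = c(4, 5, 6.5, 8, 9, 12, 13, 15),
  Act_Licenses       = c(3, 2.5, 4, 5, 4.5, 7, 6, 8)
)

# VIF for each column: regress it on all other columns and use 1 / (1 - R^2).
vif_manual <- function(data) {
  sapply(names(data), function(v) {
    others <- setdiff(names(data), v)
    r2 <- summary(lm(reformulate(others, response = v), data = data))$r.squared
    1 / (1 - r2)
  })
}

vifs <- vif_manual(toy)
print(round(vifs, 2))
```

Larger values indicate that a predictor is largely explained by the others, which inflates the standard error of its coefficient.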
# Simple linear Regression Analysis
model_permit <- lm(IGF ~ Act_Permit, data = Cleaned_AMA_Data)
model_property <- lm(IGF ~ Act_Property_Rates, data = Cleaned_AMA_Data)
model_stool <- lm(IGF ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
model_license <- lm(IGF ~ Act_Licenses, data = Cleaned_AMA_Data)
model_acts <- lm(IGF ~ Act_Fees, data = Cleaned_AMA_Data)
summary(model_permit)
##
## Call:
## lm(formula = IGF ~ Act_Permit, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10290133 -1663450 -947668 1168902 10357098
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12454615.0192 3494052.1244 3.565 0.009164 **
## Act_Permit 2.8513 0.4245 6.717 0.000273 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6169000 on 7 degrees of freedom
## Multiple R-squared: 0.8657, Adjusted R-squared: 0.8465
## F-statistic: 45.12 on 1 and 7 DF, p-value: 0.0002732
summary(model_property)
##
## Call:
## lm(formula = IGF ~ Act_Property_Rates, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3665277 -2672268 -54761 2408379 4812100
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5515997.3941 2145320.5520 2.571 0.0369 *
## Act_Property_Rates 2.9291 0.2113 13.860 0.0000024 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3156000 on 7 degrees of freedom
## Multiple R-squared: 0.9648, Adjusted R-squared: 0.9598
## F-statistic: 192.1 on 1 and 7 DF, p-value: 0.000002405
summary(model_stool)
##
## Call:
## lm(formula = IGF ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17412695 -11558375 -174585 6318832 23436450
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 31517236.88 6739651.35 4.676 0.00341 **
## Act_Stool_Lands 17.65 31.86 0.554 0.59958
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16410000 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.04867, Adjusted R-squared: -0.1099
## F-statistic: 0.3069 on 1 and 6 DF, p-value: 0.5996
summary(model_license)
##
## Call:
## lm(formula = IGF ~ Act_Licenses, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8823408 -7597936 -1026747 -552672 15575497
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4376125.4419 7415024.3626 0.590 0.5736
## Act_Licenses 3.8698 0.9643 4.013 0.0051 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9265000 on 7 degrees of freedom
## Multiple R-squared: 0.697, Adjusted R-squared: 0.6538
## F-statistic: 16.1 on 1 and 7 DF, p-value: 0.005104
summary(model_acts)
##
## Call:
## lm(formula = IGF ~ Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12794687 -4538246 -489293 6534758 12120713
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5742595.609 9762959.529 -0.588 0.5749
## Act_Fees 6.567 1.636 4.014 0.0051 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9264000 on 7 degrees of freedom
## Multiple R-squared: 0.6971, Adjusted R-squared: 0.6538
## F-statistic: 16.11 on 1 and 7 DF, p-value: 0.005101
The simple linear regressions of IGF on each land-based revenue find all of them to be significant except stool lands revenue (p = 0.600).
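The same set of one-predictor regressions can be run in a loop, which avoids repeating the `lm()` call and keeps the results in one table. This is a sketch on invented data; the `toy` data frame and the `results` table are hypothetical, and only two predictors are shown.

```r
# Hypothetical toy data (invented values); one predictor has a missing year.
toy <- data.frame(
  IGF             = c(14, 15, 16, 30, 33, 38, 40, 45, 55),
  Act_Permit      = c(1, 1.2, 1.5, 6, 7, 9, 10, 11, 15),
  Act_Stool_Lands = c(0.1, 0.3, NA, 0.2, 0.05, 0.4, 0.1, 0.25, 0.15)
)

predictors <- c("Act_Permit", "Act_Stool_Lands")

# Fit IGF ~ predictor for each predictor; reformulate() builds the formula.
fits <- lapply(setNames(predictors, predictors), function(p) {
  lm(reformulate(p, response = "IGF"), data = toy)
})

# One row per model: slope, R-squared, slope p-value, and rows actually used.
results <- data.frame(
  predictor = predictors,
  slope     = sapply(fits, function(f) coef(f)[[2]]),
  r_squared = sapply(fits, function(f) summary(f)$r.squared),
  p_value   = sapply(fits, function(f) coef(summary(f))[2, "Pr(>|t|)"]),
  n_used    = sapply(fits, nobs),
  row.names = NULL
)
print(results)
```

Reporting `n_used` alongside each fit shows which models lost an observation to missing values, as the stool lands model does in the output above.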
# DACF Trend
ggplot(Cleaned_AMA_Data, aes(x = Year, y = DACF)) +
geom_line(color = "blue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "DACF Trend Over Years",
x = "Year",
y = "DACF (Ghana Cedis)"
) +
scale_y_continuous(labels = comma) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold")
)
# Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_line(aes(y = Act_Fees, color = "Act_Fees"), size = 1) +
labs(
title = "Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis)",
color = "Revenue Type"
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
#DACF and Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_line(aes(y = Act_Fees, color = "Act_Fees"), size = 1) +
geom_line(aes(y = DACF, color = "DACF"), size = 1) +
labs(
title = "DACF vs. Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis)",
color = "Revenue Type"
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
The plots above show the trends in DACF and the land-based revenue sources over the years.
# DACF vs Land-Based Revenues
model_DACF_land <- lm(DACF ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands + Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
summary(model_DACF_land)
##
## Call:
## lm(formula = DACF ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands +
## Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## 1 2 3 4 5 6 7 8
## -439428 -237721 -210689 -170899 1170957 -1072981 3257830 -2297069
## attr(,"label")
## [1] "DACF"
## attr(,"format.spss")
## [1] "F8.0"
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5134901.7694 3636277.5367 1.412 0.293
## Act_Permit -0.5390 0.9518 -0.566 0.628
## Act_Property_Rates 0.9706 1.4470 0.671 0.571
## Act_Stool_Lands 7.7684 6.9694 1.115 0.381
## Act_Licenses -1.3473 1.0476 -1.286 0.327
## Act_Fees 0.7285 1.0611 0.687 0.563
##
## Residual standard error: 3061000 on 2 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.6754, Adjusted R-squared: -0.1362
## F-statistic: 0.8322 on 5 and 2 DF, p-value: 0.6251
# Visualizations
# Scatter plots (DACF vs each land-based revenue)
ggplot(Cleaned_AMA_Data, aes(x = Act_Permit, y = DACF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "DACF vs Permit Fees", x = "Permit Fees", y = "DACF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Property_Rates, y = DACF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "DACF vs Property Rates", x = "Property Rates", y = "DACF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Stool_Lands, y = DACF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "DACF vs Stool Lands Revenue", x = "Stool Lands Revenue", y = "DACF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Licenses, y = DACF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "DACF vs Licenses", x = "Licenses", y = "DACF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Fees, y = DACF)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "DACF vs Act Fees", x = "Act Fees", y = "DACF") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
cor_matrix_land_DACF <- cor(Cleaned_AMA_Data[, c("DACF", "Act_Permit", "Act_Property_Rates", "Act_Stool_Lands", "Act_Licenses", "Act_Fees")], use = "complete.obs")
print(cor_matrix_land_DACF)
## DACF Act_Permit Act_Property_Rates Act_Stool_Lands
## DACF 1.00000000 0.1409040 0.08019837 0.50892571
## Act_Permit 0.14090405 1.0000000 0.90243297 0.10633685
## Act_Property_Rates 0.08019837 0.9024330 1.00000000 0.24118433
## Act_Stool_Lands 0.50892571 0.1063368 0.24118433 1.00000000
## Act_Licenses -0.21252830 0.5755145 0.84214325 0.21880908
## Act_Fees 0.08350851 0.7497758 0.83973997 0.04329437
## Act_Licenses Act_Fees
## DACF -0.2125283 0.08350851
## Act_Permit 0.5755145 0.74977580
## Act_Property_Rates 0.8421432 0.83973997
## Act_Stool_Lands 0.2188091 0.04329437
## Act_Licenses 1.0000000 0.74735384
## Act_Fees 0.7473538 1.00000000
corrplot(cor_matrix_land_DACF)
The multiple regression of DACF on all the land-based revenues (permit fees, property rates, stool lands revenue, licenses, and fees) is not statistically significant (F = 0.8322, p = 0.625). Although the R-squared is 0.6754, the adjusted R-squared of -0.1362 indicates that the model fits poorly.
The correlation matrix shows that DACF is only weakly correlated with most of the land-based revenues; the exceptions are a moderate correlation with stool lands revenue (0.5089) and a weak negative correlation with Act_Licenses (-0.2125).
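The gap between R-squared and adjusted R-squared follows from the standard adjustment, adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1), which penalises the number of predictors p relative to the sample size n. The sketch below applies it to the values reported in the two multiple-regression summaries (n = 8 after one deleted observation, p = 5); the helper `adj_r2()` is a name introduced here.

```r
# Adjusted R-squared from R-squared, sample size n, and number of predictors p.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# DACF model: R-squared 0.6754 with 8 observations and 5 predictors.
adj_dacf <- adj_r2(r2 = 0.6754, n = 8, p = 5)

# IGF model: R-squared 0.9953 with the same n and p.
adj_igf <- adj_r2(r2 = 0.9953, n = 8, p = 5)

# Small differences from the printed summaries come from rounding R-squared.
print(round(c(DACF = adj_dacf, IGF = adj_igf), 4))
```

With only 2 residual degrees of freedom the penalty factor is (n - 1) / (n - p - 1) = 3.5, so an R-squared of 0.6754 is not enough to keep the adjusted value above zero.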
# Simple linear Regression Analysis
model_permit <- lm(DACF ~ Act_Permit, data = Cleaned_AMA_Data)
model_property <- lm(DACF ~ Act_Property_Rates, data = Cleaned_AMA_Data)
model_stool <- lm(DACF ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
model_license <- lm(DACF ~ Act_Licenses, data = Cleaned_AMA_Data)
model_acts <- lm(DACF ~ Act_Fees, data = Cleaned_AMA_Data)
summary(model_permit)
##
## Call:
## lm(formula = DACF ~ Act_Permit, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3959742 -2065929 -112438 2333148 3486809
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5080935.67899 1610414.00890 3.155 0.016 *
## Act_Permit 0.08371 0.19565 0.428 0.682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2843000 on 7 degrees of freedom
## Multiple R-squared: 0.02549, Adjusted R-squared: -0.1137
## F-statistic: 0.1831 on 1 and 7 DF, p-value: 0.6816
summary(model_property)
##
## Call:
## lm(formula = DACF ~ Act_Property_Rates, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4044314 -1771116 -272358 2251891 3711498
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5170032.7497 1947175.0700 2.655 0.0327 *
## Act_Property_Rates 0.0529 0.1918 0.276 0.7907
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2865000 on 7 degrees of freedom
## Multiple R-squared: 0.01075, Adjusted R-squared: -0.1306
## F-statistic: 0.07606 on 1 and 7 DF, p-value: 0.7907
summary(model_stool)
##
## Call:
## lm(formula = DACF ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3247424 -1461841 -934042 1715144 3516331
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4900423.081 1096272.898 4.470 0.00424 **
## Act_Stool_Lands 7.505 5.182 1.448 0.19773
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2670000 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.259, Adjusted R-squared: 0.1355
## F-statistic: 2.097 on 1 and 6 DF, p-value: 0.1977
summary(model_license)
##
## Call:
## lm(formula = DACF ~ Act_Licenses, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3591623 -2401501 -836409 2434841 3430832
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6701775.3160 2262264.9865 2.962 0.021 *
## Act_Licenses -0.1522 0.2942 -0.517 0.621
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2827000 on 7 degrees of freedom
## Multiple R-squared: 0.03681, Adjusted R-squared: -0.1008
## F-statistic: 0.2675 on 1 and 7 DF, p-value: 0.6209
summary(model_acts)
##
## Call:
## lm(formula = DACF ~ Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3755535 -1925880 -501374 2406473 3848641
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4953914.0247 3022950.3272 1.639 0.145
## Act_Fees 0.1209 0.5066 0.239 0.818
##
## Residual standard error: 2868000 on 7 degrees of freedom
## Multiple R-squared: 0.008064, Adjusted R-squared: -0.1336
## F-statistic: 0.05691 on 1 and 7 DF, p-value: 0.8183
The simple linear regressions of DACF on each land-based revenue find none of them to be significant.
# Capital_Expenditure Trend
ggplot(Cleaned_AMA_Data, aes(x = Year, y = Capital_Expenditure)) +
geom_line(color = "blue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "Capital Expenditure Trend Over Years",
x = "Year",
y = "Capital_Expenditure (Ghana Cedis)"
) +
scale_y_continuous(labels = comma) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold")
)
# Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_line(aes(y = Act_Fees, color = "Act_Fees"), size = 1) +
labs(
title = "Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis)",
color = "Revenue Type"
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Capital_Expenditure and Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_line(aes(y = Act_Fees, color = "Act_Fees"), size = 1) +
geom_line(aes(y = Capital_Expenditure, color = "Capital_Expenditure"), size = 1) +
labs(
title = "Capital Exp. vs. Land-Based Revenue Trends Over Years",
x = "Year",
y = "Amount (Ghana Cedis)",
color = "Series"
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
The plots above show how capital expenditure and the land-based revenues trend over the years.
# Capital_Expenditure vs Land-Based Revenues
model_Capital_Expenditure_land <- lm(Capital_Expenditure ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands + Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
summary(model_Capital_Expenditure_land)
##
## Call:
## lm(formula = Capital_Expenditure ~ Act_Permit + Act_Property_Rates +
## Act_Stool_Lands + Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## 1 2 3 4 5 6 7 8
## -290277 -325783 962890 -2932609 2321857 4427427 229294 -4392800
## attr(,"label")
## [1] "Capital Expenditure"
## attr(,"format.spss")
## [1] "F8.0"
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2278509.3718 6176772.5909 -0.369 0.748
## Act_Permit -1.8107 1.6168 -1.120 0.379
## Act_Property_Rates 4.6281 2.4580 1.883 0.200
## Act_Stool_Lands 14.9561 11.8386 1.263 0.334
## Act_Licenses -2.9286 1.7795 -1.646 0.242
## Act_Fees 0.9231 1.8024 0.512 0.659
##
## Residual standard error: 5199000 on 2 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.9354, Adjusted R-squared: 0.774
## F-statistic: 5.794 on 5 and 2 DF, p-value: 0.1537
# Visualizations
# Scatter plots (Capital_Expenditure vs each land-based revenue)
ggplot(Cleaned_AMA_Data, aes(x = Act_Permit, y = Capital_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Capital_Expenditure vs Permit Fees", x = "Permit Fees", y = "Capital_Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Property_Rates, y = Capital_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Capital_Expenditure vs Property Rates", x = "Property Rates", y = "Capital_Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Stool_Lands, y = Capital_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Capital_Expenditure vs Stool Lands Revenue", x = "Stool Lands Revenue", y = "Capital_Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Licenses, y = Capital_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Capital_Expenditure vs Licenses", x = "Licenses", y = "Capital_Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Fees, y = Capital_Expenditure)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Capital_Expenditure vs Act Fees", x = "Act Fees", y = "Capital_Expenditure") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
cor_matrix_land_Capital_Expenditure <- cor(Cleaned_AMA_Data[, c("Capital_Expenditure", "Act_Permit", "Act_Property_Rates", "Act_Stool_Lands", "Act_Licenses", "Act_Fees")], use = "complete.obs")
print(cor_matrix_land_Capital_Expenditure)
## Capital_Expenditure Act_Permit Act_Property_Rates
## Capital_Expenditure 1.0000000 0.7676261 0.8627055
## Act_Permit 0.7676261 1.0000000 0.9024330
## Act_Property_Rates 0.8627055 0.9024330 1.0000000
## Act_Stool_Lands 0.5085570 0.1063368 0.2411843
## Act_Licenses 0.6154982 0.5755145 0.8421432
## Act_Fees 0.7035596 0.7497758 0.8397400
## Act_Stool_Lands Act_Licenses Act_Fees
## Capital_Expenditure 0.50855700 0.6154982 0.70355965
## Act_Permit 0.10633685 0.5755145 0.74977580
## Act_Property_Rates 0.24118433 0.8421432 0.83973997
## Act_Stool_Lands 1.00000000 0.2188091 0.04329437
## Act_Licenses 0.21880908 1.0000000 0.74735384
## Act_Fees 0.04329437 0.7473538 1.00000000
corrplot(cor_matrix_land_Capital_Expenditure)
The multiple regression of Capital_Expenditure on all the land-based revenues (permit fees, property rates, stool lands revenue, licenses, fees) is not statistically significant overall (F-test p-value = 0.1537), despite an R-squared of 0.9354 and an adjusted R-squared of 0.774. The model estimates 6 coefficients from 8 complete observations, leaving only 2 residual degrees of freedom, so the high R-squared should be read with caution.
The correlation matrix shows that Capital_Expenditure is moderately to strongly positively correlated with all the land-based revenues, from 0.5086 for Act_Stool_Lands up to 0.8627 for Act_Property_Rates.
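The predictors in the multiple regression are themselves strongly correlated (for example, 0.90 between Act_Permit and Act_Property_Rates), which may help explain why no individual coefficient is significant even though R-squared is high. A minimal sketch of a collinearity check follows. It uses only the predictor correlations printed above and the standard identity that variance inflation factors (VIFs) are the diagonal of the inverse predictor correlation matrix. The object names `R` and `vif` are illustrative and not part of the original analysis.

```r
# Predictor-to-predictor correlations copied from the printed matrix above
# (computed there with use = "complete.obs", i.e. the same 8 rows as the model).
vars <- c("Act_Permit", "Act_Property_Rates", "Act_Stool_Lands",
          "Act_Licenses", "Act_Fees")
R <- matrix(c(
  1.00000000, 0.90243300, 0.10633685, 0.57551450, 0.74977580,
  0.90243300, 1.00000000, 0.24118433, 0.84214320, 0.83973997,
  0.10633685, 0.24118433, 1.00000000, 0.21880908, 0.04329437,
  0.57551450, 0.84214320, 0.21880908, 1.00000000, 0.74735384,
  0.74977580, 0.83973997, 0.04329437, 0.74735384, 1.00000000
), nrow = 5, byrow = TRUE, dimnames = list(vars, vars))

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on
# the other predictors; this equals the j-th diagonal element of solve(R).
vif <- diag(solve(R))
print(round(vif, 2))
```

A common rule of thumb treats VIFs well above 5 or 10 as a sign that coefficient estimates are unstable. With only 8 observations, that reading is itself tentative.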
# Simple linear Regression Analysis
model_permit <- lm(Capital_Expenditure ~ Act_Permit, data = Cleaned_AMA_Data)
model_property <- lm(Capital_Expenditure ~ Act_Property_Rates, data = Cleaned_AMA_Data)
model_stool <- lm(Capital_Expenditure ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
model_license <- lm(Capital_Expenditure ~ Act_Licenses, data = Cleaned_AMA_Data)
model_acts <- lm(Capital_Expenditure ~ Act_Fees, data = Cleaned_AMA_Data)
summary(model_permit)
##
## Call:
## lm(formula = Capital_Expenditure ~ Act_Permit, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11639140 -2390386 -2163765 4312870 9866492
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1178816.3655 4021127.3597 0.293 0.77790
## Act_Permit 1.7458 0.4885 3.574 0.00905 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7099000 on 7 degrees of freedom
## Multiple R-squared: 0.6459, Adjusted R-squared: 0.5954
## F-statistic: 12.77 on 1 and 7 DF, p-value: 0.009053
summary(model_property)
##
## Call:
## lm(formula = Capital_Expenditure ~ Act_Property_Rates, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8327967 -2551875 739936 1698513 8246275
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3727439.691 3796444.590 -0.982 0.35890
## Act_Property_Rates 1.868 0.374 4.994 0.00158 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5585000 on 7 degrees of freedom
## Multiple R-squared: 0.7809, Adjusted R-squared: 0.7496
## F-statistic: 24.94 on 1 and 7 DF, p-value: 0.001575
summary(model_stool)
##
## Call:
## lm(formula = Capital_Expenditure ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10809249 -6236120 -2126552 6446824 14548237
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11211819.21 4176089.15 2.685 0.0363 *
## Act_Stool_Lands 28.56 19.74 1.447 0.1981
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10170000 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.2586, Adjusted R-squared: 0.1351
## F-statistic: 2.093 on 1 and 6 DF, p-value: 0.1981
summary(model_license)
##
## Call:
## lm(formula = Capital_Expenditure ~ Act_Licenses, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8100589 -6637123 -2270071 2953494 19335590
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1746902.1264 7390684.5184 -0.236 0.8199
## Act_Licenses 2.0804 0.9611 2.165 0.0672 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9234000 on 7 degrees of freedom
## Multiple R-squared: 0.4009, Adjusted R-squared: 0.3154
## F-statistic: 4.685 on 1 and 7 DF, p-value: 0.06716
summary(model_acts)
##
## Call:
## lm(formula = Capital_Expenditure ~ Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10118563 -6155074 -2098420 3919519 13132755
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8589611.061 9245961.154 -0.929 0.3838
## Act_Fees 3.778 1.550 2.438 0.0449 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8774000 on 7 degrees of freedom
## Multiple R-squared: 0.4593, Adjusted R-squared: 0.382
## F-statistic: 5.945 on 1 and 7 DF, p-value: 0.04487
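The simple linear regressions of Capital_Expenditure on each land-based revenue found permit fees (p = 0.0091), property rates (p = 0.0016), and Act_Fees (p = 0.0449) to be significant at the 5% level. Licenses (p = 0.0672) and stool lands revenue (p = 0.1981) were not.
# The recurrent expenditure is all NA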
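Because five separate slope tests are run against the same response, some readers may want to see how the conclusions hold up under a multiple-comparison adjustment. The sketch below applies base R's `p.adjust()` with the Holm method to the five model p-values printed in the summaries above. The choice of Holm is an assumption made here for illustration and is not part of the original analysis.

```r
# Model p-values copied from the five summary() outputs above.
p_raw <- c(
  Act_Permit         = 0.009053,
  Act_Property_Rates = 0.001575,
  Act_Stool_Lands    = 0.1981,
  Act_Licenses       = 0.06716,
  Act_Fees           = 0.04487
)

# Holm step-down adjustment controls the family-wise error rate.
p_holm <- p.adjust(p_raw, method = "holm")

print(data.frame(
  p_raw            = p_raw,
  p_holm           = round(p_holm, 4),
  significant_5pct = p_holm < 0.05
))
```

With these inputs, the adjusted p-values for permit fees and property rates stay below 0.05, while the one for Act_Fees does not. The Act_Fees result is therefore the most fragile of the three.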
# Population Trend
ggplot(Cleaned_AMA_Data, aes(x = Year, y = Population)) +
geom_line(color = "blue", size = 1) +
geom_point(size = 2.5) +
labs(
title = "Population Trend Over Years",
x = "Year",
y = "Population"
) +
scale_y_continuous(labels = comma) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold")
)
# Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_line(aes(y = Act_Fees, color = "Act_Fees"), size = 1) +
labs(
title = "Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis)",
color = "Revenue Type"
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
# Population and Land-Based Revenue Trends
ggplot(Cleaned_AMA_Data, aes(x = Year)) +
geom_line(aes(y = Act_Permit, color = "Permit Fees"), size = 1) +
geom_line(aes(y = Act_Property_Rates, color = "Property Rates"), size = 1) +
geom_line(aes(y = Act_Stool_Lands, color = "Stool Lands Revenue"), size = 1) +
geom_line(aes(y = Act_Licenses, color = "Licenses"), size = 1) +
geom_line(aes(y = Act_Fees, color = "Act_Fees"), size = 1) +
geom_line(aes(y = Population, color = "Population"), size = 1) +
labs(
title = "Population vs. Land-Based Revenue Trends Over Years",
x = "Year",
y = "Revenue (Ghana Cedis) / Population",
color = "Series"
) +
scale_y_continuous(labels = comma) +
theme(
legend.position = "right",
legend.title = element_text(face = "bold"),
plot.title = element_text(hjust = 0.5, face = "bold")
)
The plots above show how population and the land-based revenues trend over the years.
# Population vs Land-Based Revenues
model_Population_land <- lm(Population ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands + Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
summary(model_Population_land)
##
## Call:
## lm(formula = Population ~ Act_Permit + Act_Property_Rates + Act_Stool_Lands +
## Act_Licenses + Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## 1 2 3 4 5 6 7 8
## -112743 -81093 77221 -379728 484867 336527 606817 -931868
## attr(,"label")
## [1] "Population"
## attr(,"format.spss")
## [1] "F8.0"
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1639733.9905 1112721.8535 1.474 0.278
## Act_Permit 0.1877 0.2913 0.644 0.585
## Act_Property_Rates -0.1650 0.4428 -0.373 0.745
## Act_Stool_Lands 0.3929 2.1327 0.184 0.871
## Act_Licenses 0.1441 0.3206 0.450 0.697
## Act_Fees -0.1305 0.3247 -0.402 0.727
##
## Residual standard error: 936600 on 2 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3289, Adjusted R-squared: -1.349
## F-statistic: 0.1961 on 5 and 2 DF, p-value: 0.9379
# Visualizations
# Scatter plots (Population vs each land-based revenue)
ggplot(Cleaned_AMA_Data, aes(x = Act_Permit, y = Population)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Population vs Permit Fees", x = "Permit Fees", y = "Population") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Property_Rates, y = Population)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Population vs Property Rates", x = "Property Rates", y = "Population") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Stool_Lands, y = Population)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Population vs Stool Lands Revenue", x = "Stool Lands Revenue", y = "Population") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Licenses, y = Population)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Population vs Licenses", x = "Licenses", y = "Population") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
ggplot(Cleaned_AMA_Data, aes(x = Act_Fees, y = Population)) +
geom_point() +
geom_smooth(method = "lm", se = TRUE) +
labs(title = "Population vs Act Fees", x = "Act Fees", y = "Population") +
scale_x_continuous(labels = comma) +
scale_y_continuous(labels = comma)
cor_matrix_land_Population <- cor(Cleaned_AMA_Data[, c("Population", "Act_Permit", "Act_Property_Rates", "Act_Stool_Lands", "Act_Licenses", "Act_Fees")], use = "complete.obs")
print(cor_matrix_land_Population)
## Population Act_Permit Act_Property_Rates Act_Stool_Lands
## Population 1.0000000 0.4390261 0.3497229 0.11494984
## Act_Permit 0.4390261 1.0000000 0.9024330 0.10633685
## Act_Property_Rates 0.3497229 0.9024330 1.0000000 0.24118433
## Act_Stool_Lands 0.1149498 0.1063368 0.2411843 1.00000000
## Act_Licenses 0.2354682 0.5755145 0.8421432 0.21880908
## Act_Fees 0.1607198 0.7497758 0.8397400 0.04329437
## Act_Licenses Act_Fees
## Population 0.2354682 0.16071981
## Act_Permit 0.5755145 0.74977580
## Act_Property_Rates 0.8421432 0.83973997
## Act_Stool_Lands 0.2188091 0.04329437
## Act_Licenses 1.0000000 0.74735384
## Act_Fees 0.7473538 1.00000000
corrplot(cor_matrix_land_Population)
The multiple regression of Population on all the land-based revenues (permit fees, property rates, stool lands revenue, licenses, fees) is not statistically significant (F-test p-value = 0.9379). The R-squared of 0.3289 and the negative adjusted R-squared of -1.349 indicate that the model fits poorly.
The correlation matrix shows that Population is only weakly to moderately correlated with the land-based revenues, from 0.1149 for Act_Stool_Lands to 0.4390 for Act_Permit.
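A negative adjusted R-squared can look like an error, so it may help to reproduce it by hand. The sketch below applies the usual formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), with n = 8 complete observations and p = 5 predictors, as implied by the 2 residual degrees of freedom in the output above. The helper name `adj_r2` is illustrative.

```r
# Adjusted R-squared from R-squared, sample size n, and number of predictors p.
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Population ~ five land-based revenues (R-squared = 0.3289)
print(round(adj_r2(0.3289, n = 8, p = 5), 3))  # -1.349, as in the summary

# Capital_Expenditure ~ five land-based revenues (R-squared = 0.9354)
print(round(adj_r2(0.9354, n = 8, p = 5), 3))  # 0.774, as in the summary
```

With so few residual degrees of freedom, the penalty factor (n - 1)/(n - p - 1) is 3.5, so any model with R-squared below about 0.71 ends up with a negative adjusted value.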
# Simple linear Regression Analysis
model_permit <- lm(Population ~ Act_Permit, data = Cleaned_AMA_Data)
model_property <- lm(Population ~ Act_Property_Rates, data = Cleaned_AMA_Data)
model_stool <- lm(Population ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
model_license <- lm(Population ~ Act_Licenses, data = Cleaned_AMA_Data)
model_acts <- lm(Population ~ Act_Fees, data = Cleaned_AMA_Data)
summary(model_permit)
##
## Call:
## lm(formula = Population ~ Act_Permit, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -881384 -354792 132419 250958 950927
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1064740.26150 379359.37911 2.807 0.0263 *
## Act_Permit 0.08249 0.04609 1.790 0.1166
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 669800 on 7 degrees of freedom
## Multiple R-squared: 0.314, Adjusted R-squared: 0.2159
## F-statistic: 3.203 on 1 and 7 DF, p-value: 0.1166
summary(model_property)
##
## Call:
## lm(formula = Population ~ Act_Property_Rates, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -981516 -86579 68457 134937 980503
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 985676.47224 477416.32470 2.065 0.0778 .
## Act_Property_Rates 0.07099 0.04703 1.509 0.1749
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 702400 on 7 degrees of freedom
## Multiple R-squared: 0.2455, Adjusted R-squared: 0.1378
## F-statistic: 2.278 on 1 and 7 DF, p-value: 0.1749
summary(model_stool)
##
## Call:
## lm(formula = Population ~ Act_Stool_Lands, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1461810 82662 193226 291636 384290
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1739981.1862 269251.5811 6.462 0.000651 ***
## Act_Stool_Lands 0.3608 1.2728 0.283 0.786360
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 655700 on 6 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.01321, Adjusted R-squared: -0.1513
## F-statistic: 0.08034 on 1 and 6 DF, p-value: 0.7864
summary(model_license)
##
## Call:
## lm(formula = Population ~ Act_Licenses, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1196191 32760 103624 515365 803061
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1136758.40669 616045.44957 1.845 0.108
## Act_Licenses 0.06822 0.08011 0.852 0.423
##
## Residual standard error: 769700 on 7 degrees of freedom
## Multiple R-squared: 0.09387, Adjusted R-squared: -0.03558
## F-statistic: 0.7251 on 1 and 7 DF, p-value: 0.4226
summary(model_acts)
##
## Call:
## lm(formula = Population ~ Act_Fees, data = Cleaned_AMA_Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1288938 166437 274841 368189 688709
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1231094.66893 838411.81878 1.468 0.185
## Act_Fees 0.06759 0.14051 0.481 0.645
##
## Residual standard error: 795600 on 7 degrees of freedom
## Multiple R-squared: 0.032, Adjusted R-squared: -0.1063
## F-statistic: 0.2314 on 1 and 7 DF, p-value: 0.6452
The simple linear regressions of Population on each land-based revenue found none of them to be significant (the smallest slope p-value is 0.1166, for permit fees).
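The same five-model pattern is repeated for each response variable, so the slope, p-value, and R-squared of every simple regression could be gathered into one table instead of being read off separate summaries. The sketch below shows one way to do that in base R. The helper `slope_table()` is hypothetical, and the built-in `mtcars` data stands in for `Cleaned_AMA_Data`, which is not available here.

```r
# Fit one simple regression per predictor and collect the key numbers.
slope_table <- function(data, response, predictors) {
  rows <- lapply(predictors, function(p) {
    fit   <- lm(reformulate(p, response = response), data = data)
    s     <- summary(fit)
    coefs <- s$coefficients
    data.frame(
      predictor = p,
      slope     = coefs[2, "Estimate"],
      p_value   = coefs[2, "Pr(>|t|)"],
      r_squared = s$r.squared,
      n_used    = nobs(fit)  # rows actually used after dropping NAs
    )
  })
  do.call(rbind, rows)
}

# Demonstration on built-in data (a stand-in for the report's data set).
res <- slope_table(mtcars, "mpg", c("wt", "hp", "qsec"))
print(res)

# With the report's data, the call would presumably look like:
# slope_table(Cleaned_AMA_Data, "Population",
#             c("Act_Permit", "Act_Property_Rates", "Act_Stool_Lands",
#               "Act_Licenses", "Act_Fees"))
```

The `n_used` column makes it visible when a model silently drops a row, as happens for Act_Stool_Lands in the summaries above.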
# no variables